Abstract

The members of the paired box (Pax) family regulate key developmental pathways in many metazoans as tissue-specific transcription
factors. Vertebrate genomes typically possess nine Pax genes (Pax1-9), which are derived from four proto-Pax genes in the vertebrate
ancestor that were later expanded through the so-called two-round (2R) whole-genome duplication. A recent study proposed that
pax6a genes of a subset of teleost fishes (namely, acanthopterygians) are remnants of a paralog generated in the 2R genome
duplication, to be renamed pax6.3, and reported one more group of vertebrate Pax genes (Pax6.2), most closely related to the Pax4/6
class. We propose to designate this new member Pax10 instead and reconstruct the evolutionary history of the Pax4/6/10 class with
solid phylogenetic evidence. Our synteny analysis showed that Pax4, -6, and -10 originated in the 2R genome duplications early in
vertebrate evolution. The phylogenetic analyses of relationships between teleost pax6a and other Pax4, -6, and -10 genes, however,
do not support the proposed hypothesis of an ancient origin of the acanthopterygian pax6a genes in the 2R genome duplication.
Instead, we confirmed the traditional scenario that the acanthopterygian pax6a is derived from the more recent teleost-specific
genome duplication. Notably, Pax6 is present in all vertebrates surveyed to date, whereas Pax4 and -10 were lost multiple times in
independent vertebrate lineages, likely because of their restricted expression patterns: Among Pax6-positive domains, Pax10 has
retained expression in the adult retina alone, which we documented through in situ hybridization and quantitative reverse transcription
polymerase chain reaction experiments on zebrafish, Xenopus, and anole lizard.
Introduction

The vertebrate gene repertoire was shaped by two rounds (2R) 
of whole-genome duplications (WGD) early in vertebrate evo- 
lution (Ohno 1970; Lundin 1993; Holland et al. 1994). These 
events initially generated four paralogs in vertebrates corre- 
sponding to a single invertebrate ortholog, and subsequent 
processes such as neofunctionalization, subfunctionalization, 
or complete loss of function modified this initial four-fold ge- 
netic abundance of evolutionary raw material. Genes that play 
crucial roles in development tend to be highly conserved and 
are therefore present in the genomes of diverse vertebrates. In 
contrast, genes that are less crucial for development and sur- 
vival experience less selective pressure in the form of balancing 
selection, and are hence permitted to be differentially lost in
vertebrate lineages (Lynch et al. 2001). Examples are the
Bmp16 gene, a sister gene of the highly conserved Bmp2
and -4 genes (Feiner et al. 2009), which has only been
found in teleosts so far, and the Pdx2 gene, a duplicate of
the pancreatic key regulator Pdx1, which is only retained in
cartilaginous fish and coelacanths (Mulley and Holland 2010).

Members of the Pax gene family encode transcription fac- 
tors that play crucial regulatory roles in metazoan develop- 
ment (reviewed in Wehr and Gruss 1996). All vertebrate Pax 
proteins identified to date are characterized by the possession 
of a paired domain (Breitling and Gerber 2000; Underhill
2012). They are divided into four classes, namely Pax1/9,
Pax3/7, Pax2/5/8, and Pax4/6, based on the completeness of
a homeodomain and the presence of an octapeptide motif
(fig. 1A; Wehr and Gruss 1996; Chi and Epstein 2002). The
last common ancestor of bilaterians, the so-called
"Urbilateria," already possessed proto-orthologs of these
four Pax classes, plus an additional class, the PaxA/Pox neuro
class, restricted to invertebrates (Matus et al. 2007). Preceding
the radiation of vertebrates, each of the four classes was
quadruplicated by the 2R-WGD (Wada et al. 1998; Holland et al.
1999; Ogasawara et al. 1999; Manousaki et al. 2011).

The Pax4/6 class of genes consists of Pax4 and -6 genes as 
well as a recently identified gene named Pax6.2 (Ravi et al.
2013). This novel gene has only been identified in a chon- 
drichthyan (the elephant shark, Callorhinchus milii), several 
teleost fish, a reptile (the green anole, Anolis carolinensis), 
and an amphibian (the frog, Xenopus tropicalis; fig. 1B; Ravi
et al. 2013). Pax4, the other close relative of Pax6 (Manousaki
et al. 2011), also shows a mosaic pattern of phylogenetic
distribution, confined to mammals (Pilz et al. 1993; Tamura et al.
1994) and teleost fish (Hoshiyama et al. 2007; Manousaki
et al. 2011), whereas the Pax6 gene is identified in every
vertebrate genome sequenced to date (fig. 1B). Ravi et al. (2013)
also recently called the phylogeny of teleost fish pax6 genes 
into question. Acanthopterygii is a group of teleost fish, and 
among the typical laboratory animals with the sequenced ge- 
nomes, it includes medaka, Fugu, and stickleback, but not 
zebrafish. Ravi et al. (2013) concluded that acanthopterygian pax6a and
-6b genes did not originate in the so-called "third-round"
WGD in the teleost lineage (teleost-specific genome duplication,
TSGD; Amores et al. 1998; Wittbrodt et al. 1998; Meyer
and Van de Peer 2005). Instead, they proposed that
acanthopterygian pax6a genes originated in a more ancient event,
namely the 2R-WGD at the base of vertebrate evolution
(fig. 2A; Ravi et al. 2013). Importantly, this would imply that
the acanthopterygian pax6a genes are not orthologous to all
other vertebrate Pax6 genes. To demarcate acanthopterygian
pax6a genes and emphasize their ancient origin, Ravi et al.
(2013) therefore proposed to rename them pax6.3.

The Pax6 gene is famous for its essential role as "master 
control gene" for eye development. Studies in the 1990s re- 
vealed the ability of an ectopically expressed mouse Pax6 gene 
to induce ectopic eyes in Drosophila (Halder et al. 1995). Apart
from this inductive role in eye development, the vertebrate
Pax6 gene is involved in the development of the central
nervous system (CNS), including fore- and hindbrain, the neural
tube, the pituitary, the nasal epithelium, and the endocrine
part of the pancreas (Walther and Gruss 1991; St-Onge et al.
1997). In zebrafish, pax6b, but not pax6a ("pax6.3" in Ravi
et al. 2013), is expressed in the endocrine pancreas (Delporte
et al. 2008). The vertebrate paralog of Pax6, the Pax4 gene, is
necessary for the differentiation of insulin-producing β-cells in
the endocrine part of the mammalian pancreas (Sosa-Pineda
Fig. 1. Domain structure of vertebrate Pax proteins and phylogenetic distribution of Pax4, -6, and -10 genes across jawed vertebrates. (A) Presence of
paired domains (PD), homeodomains (HD), and octapeptides (O) for all vertebrate Pax subtypes. No paired box has been identified in any of the Pax10 genes,
and thus, mature Pax10 proteins presumably lack a paired domain. (B) Phylogeny of major vertebrate taxa with indicated patterns of presence and presumed
absence of Pax4, -6, and -10 genes. The presence of these genes was investigated using exhaustive Blast searches in publicly available whole-genome
sequences (see supplementary table S1, Supplementary Material online, for details). The chondrichthyan Pax10 gene was reported by Ravi et al. (2013).
Inferred secondary gene losses are indicated with red and blue crosses and mapped onto the generally accepted jawed vertebrate phylogeny. Question 
marks indicate uncertainties about the absence of genes because of insufficient sequence information of the respective taxa. The phylogenetic position of 
turtles is based on molecular phylogenetic studies (Zardoya and Meyer 1998; Rest et al. 2003; Iwabe et al. 2005; Chiari et al. 2012; Crawford et al. 2012; 
Wang et al. 2013). 
Fig. 2. Phylogenetic relationships within the Pax4/6/10 class of genes. (A) Schematic presentations of two scenarios of the evolution of the Pax4/6/10
class of genes. Hypothesis 1, proposed by Ravi et al. (2013), assumes an ancient origin of one group of acanthopterygian pax6 genes, namely pax6.3. In
addition, this hypothesis does not take the group of Pax4 genes into account. Hypothesis 2 is proposed based on our phylogenetic analysis, and it postulates
the origin of both groups of teleost pax6 genes, namely pax6a and -6b, in the TSGD. The gene nomenclature in Hypothesis 1 is adopted from Ravi et al.
(2013). (B) ML tree showing phylogenetic relationships among vertebrate genes belonging to the Pax4/6/10 class, with Pax3 and -7 genes as outgroup. Exact
names of the genes are included only when they experienced additional lineage-specific duplications. Support values are shown for each node in order,
bootstrap probabilities in the ML analysis and Bayesian posterior probabilities. Only bootstrap probabilities above 50 are shown. The inference is based on 99 amino
acid residues, assuming the JTT+Γ4 model (shape parameter of the gamma distribution α = 0.57). The scale bar on the upper left indicates 0.2 substitutions
per site. 



Genome Biol. Evol. 6(7):1635-1651. doi:10.1093/gbe/evu135 Advance Access publication June 19, 2014






Feiner et al.



GBE 



et al. 1997), but not of the zebrafish pancreas (Djiotsa et al. 
2012). This indicates evolutionarily conserved roles of Pax6
and lineage-specific differences in the roles of Pax4 during
pancreas development among vertebrates. The Pax4 genes
of mammals and teleost fishes are not implicated in the
development of eyes (Rath et al. 2009; Manousaki et al. 2011).
However, mammalian Pax4 genes are expressed in
photoreceptors in the outer nuclear layer of the adult mammalian
retina in a diurnal rhythm (Rath et al. 2009). The Pax6 gene
is expressed in other neuronal layers of the mature retina,
namely the inner nuclear layer and the ganglion cell layer, and,
in several species including a shark, zebrafish, chicken, and
mouse, also in the horizontal cell layer (Belecky-Adams
et al. 1997; Macdonald and Wilson 1997; Wullimann and Rink
2001; Ferreiro-Galve et al. 2011). The teleost pax4 gene has
an expression domain that has not been attributed to any
other Pax gene, namely the stomodeum, which corresponds
to the developing lip (Manousaki et al. 2011).

Information on the developmental roles of the novel Pax6
relative, Pax6.2, is limited because it was found only very
recently (Ravi et al. 2013) and also because it is absent from the
genomes of traditional model species. This recent study
revealed the expression of the elephant shark Pax6.2 gene in
its adult eye, shared with Pax4 and -6 genes, and the adult
kidney in which none of the other Pax6 relatives have been
shown to be expressed (Ravi et al. 2013). Its zebrafish ortholog
is expressed in the head region during early developmental 
stages, and becomes restricted to the inner nuclear layer of 
the retina during late developmental stages (Ravi et al. 2013). 

In this study, we assessed the molecular phylogeny of the 
recently identified relative of the Pax6 gene (Pax6.2 in Ravi
et al. 2013) and propose to call it Pax10 instead. By conducting
rigorous molecular phylogenetic analyses and considering
conserved synteny, we demonstrate that Pax4, -6, and -10
originated in the 2R-WGD, and that the two pax6 genes of
teleost fishes (pax6a and pax6b) were duplicated in the TSGD.
Our gene expression analyses in the zebrafish, a frog, and the
green anole lizard revealed Pax10 expression in the adult
retina and brain and suggested that Pax10 presumably plays
no role during early development. Our reanalysis provides a 
synthetic understanding of the evolution of the vertebrate 
Pax4/6/10 gene repertoire and the functional partitioning of 
these genes. 

Materials and Methods 

In Silico Identification of Novel Pax Genes

Using the green anole Pax10 and human Pax4 nucleotide
sequences as queries, we performed BLASTn searches in the
NCBI dbEST (National Center for Biotechnology Information)
as well as in the nr/nt database and in nucleotide genomic
sequences of species included in the Ensembl genome
browser. Similarly, local BLASTn searches were performed in
downloaded genome-wide or transcriptomic sequence re- 
sources of three cyclostomes (inshore hagfish, sea lamprey, 
and Japanese lamprey), three chondrichthyans (elephant 
shark, little skate, and cloudy catshark), a basal actinopterygian 
(spotted gar), the African coelacanth, reptiles (American alli- 
gator, Burmese python, Chinese soft-shell turtle, painted 
turtle, and saltwater crocodile), and birds (budgerigar, collared 
flycatcher, mallard, medium ground finch, turkey, and zebra 
finch; see supplementary table S1, Supplementary Material
online, for details). Candidate sequences with E value of less 
than 1e-05 in these BLASTn searches were subjected to pre- 
liminary phylogenetic tree inferences. Sequences placed inside 
the Pax4/6/10 class were selected, and their longest open
reading frames (ORFs) were curated either manually or by using the
gene prediction program AUGUSTUS (Stanke et al. 2004). All
sequences identified in this in silico screen, except for saltwater
crocodile Pax10 and Japanese lamprey Pax6B (supplementary
tables S2 and S3, Supplementary Material online), are depos- 
ited in EMBL under accession numbers HF567444-HF567455. 
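
The E-value filter described above can be sketched as follows. This is a minimal illustration only, assuming the BLASTn hits were exported in the standard 12-column tabular format (-outfmt 6), in which the 11th column holds the E value; the text does not state which output format was used, and the function name and example rows are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-05  # threshold stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (query, subject, evalue) for hits with an E value below cutoff.

    Assumes BLAST tabular output (-outfmt 6): column 11 is the E value.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        evalue = float(row[10])
        if evalue < cutoff:
            hits.append((row[0], row[1], evalue))
    return hits


# Hypothetical rows for illustration; these are not real search results.
example = (
    "anole_Pax10\tcontigA\t88.0\t150\t18\t0\t1\t150\t200\t349\t3e-40\t160\n"
    "anole_Pax10\tcontigB\t70.0\t60\t18\t0\t1\t60\t10\t69\t0.002\t38.2\n"
)
print(filter_hits(example))
```

Only the first hypothetical hit passes the cutoff; candidates retained in this way would then enter the preliminary tree inference.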

Animals 

Wild-type embryos and albino adults of the zebrafish (Danio
rerio) and embryos (staging according to Nieuwkoop and
Faber 1994) and adults of Xenopus laevis were obtained
from captive breeding colonies. Eggs of the green anole
lizard (A. carolinensis) were harvested from in-house captive
breeding colonies and incubated at 28 °C and approximately 
70% humidity until they reached required stages determined 
after Sanger et al. (2008). Animals that were subjected to 
sectioning or whole-mount in situ hybridizations were stored 
in methanol after fixation in 4% paraformaldehyde or Serra's 
fixative, respectively. All experiments were conducted in ac- 
cordance with the animal use protocols of the University of 
Konstanz. 

Reverse Transcription Polymerase Chain Reaction 

Total RNA of embryos of zebrafish (24 hpf), Xenopus (11
individuals for each of stages 22, 30, and 40), and the
green anole (stage 8.5) was extracted and reverse transcribed
into cDNA using Superscript III (Invitrogen), following the
instructions of the 3′-RACE System (Invitrogen).
Oligonucleotide primers were designed based on Pax10
transcript sequences of zebrafish (ENSDART00000075395),
X. tropicalis (ENSXETT00000065934), and the green anole
(ENSACAT00000013868) to amplify full-length cDNAs of zebrafish
pax10a, X. laevis Pax10a and -10b, and the green anole
Pax10 genes. 3′-ends of these four fragments were isolated
by applying the 3′-RACE System (Invitrogen). 5′-ends of the
green anole Pax10 cDNA were obtained using the GeneRacer
Kit (Invitrogen) and 5′-ends of the zebrafish pax10a and
X. laevis Pax10a and -10b were obtained using the 5′-RACE
System (Invitrogen). Full-length Pax10 cDNA sequences of
zebrafish, X. laevis, and green anole were assembled from 






Evolution of the Vertebrate Pax4/6/10 Class of Genes






two fragments and are deposited in EMBL under accession 
numbers HF567440-HF567443. 

In order to detect Pax6 expression in the eyes of adult 
zebrafish, Xenopus, and the green anole, cDNA fragments 
covering the respective 3′-ends were isolated using the
3′-RACE System (Invitrogen). Primers were designed
based on Pax6 transcript sequences of zebrafish (Ensembl
ID: ENSDART00000148420 for pax6a and
ENSDART00000145946 for pax6b), X. laevis (NCBI:
NM_001085944 for Pax6a and NM_001172195 for Pax6b),
and the green anole (Ensembl ID: ENSACAT00000002377).
Zebrafish pax4 probe synthesis employed the cDNA fragment
previously reported (Manousaki et al. 2011).

In order to analyze differential expression of Pax6 and -10
genes between various organs of adult zebrafish, Xenopus 
and the green anole, and between embryonic stages of de- 
veloping zebrafish, the animals were dissected. Total RNA was 
extracted using TRIzol (Invitrogen) and treated with DNase I. 
The integrity of the extracted RNA was monitored using the 
Bioanalyzer 2100 (Agilent). Gene-specific primers to amplify 
approximately 200-bp-long cDNA fragments of Pax6, -10, and
GAPDH were designed (supplementary table S4, Supplementary
Material online). It should be noted that primers for the
amplification of X. laevis Pax6 and -10 were designed to
capture both "a" and "b" paralogs. The specificity of each primer
pair was determined in a preliminary pilot polymerase chain
reaction, and the amplification of comparable amounts of
GAPDH cDNA fragments was used as proxy for similar
quantity of cDNAs between samples. Semiquantitative reverse
transcription (RT)-PCR runs were conducted using the
DreamTaq™ DNA Polymerase (Fermentas). A predenaturing
step at 95 °C for 3 min was followed by 35-50 cycles of three
steps (95 °C for 30 s, 58 °C for 1 min, and 72 °C for 1 min).
The amount of PCR product was visualized using standard
agarose gel electrophoresis (supplementary fig. S1,
Supplementary Material online). The intensity of individual bands
was quantified using GelQuant.NET software
(http://biochemlabsolutions.com, last accessed June 26, 2014). Expression
levels of Pax6 and -10 genes were normalized to relative
expression levels of GAPDH. The resulting values were further
normalized to the tissue with the highest expression level for the
respective gene, which was defined as 1. These values were
shown in a heat map generated by CIMminer
(http://discover.nci.nih.gov/cimminer, last accessed June 26, 2014),
which was also used for the clustering of genes based on
expression patterns with a Euclidean distance algorithm.
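
The two-step normalization described above can be illustrated with a short sketch. It assumes that band intensities are available as one number per tissue for the target gene and for GAPDH; the function name and all intensity values are hypothetical and are not taken from the supplementary data.

```python
def normalize_expression(target, reference):
    """Scale target band intensities to a reference gene, then to the maximum.

    target, reference: dicts mapping tissue name -> band intensity.
    Returns a dict in which the tissue with the highest ratio has value 1.
    """
    # Step 1: normalize each tissue to the reference gene (e.g., GAPDH).
    ratios = {tissue: target[tissue] / reference[tissue] for tissue in target}
    top = max(ratios.values())
    if top == 0:
        raise ValueError("no expression detected in any tissue")
    # Step 2: rescale so that the highest-expressing tissue is defined as 1.
    return {tissue: ratio / top for tissue, ratio in ratios.items()}


# Hypothetical band intensities for illustration only.
pax10 = {"eye": 900.0, "brain": 300.0, "liver": 0.0}
gapdh = {"eye": 1000.0, "brain": 1000.0, "liver": 1000.0}
print(normalize_expression(pax10, gapdh))
```

In this made-up example the eye receives the value 1, the brain about one third of that, and the liver 0; such per-gene vectors are what a heat map and a Euclidean-distance clustering would take as input.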

In Situ Hybridization 

The aforementioned 3′- and 5′-cDNA fragments of zebrafish,
Xenopus, and green anole Pax10, 3′-fragments of zebrafish,
Xenopus, and green anole Pax6, and the 3′-fragment of zebrafish
pax4 were used as templates for riboprobe synthesis.
Whole-mount in situ hybridizations for two Pax10 probes and a Pax6
probe as positive control were performed using embryonic
zebrafish (Begemann et al. 2001), Xenopus (Gawantka et al.
1995), and green anole (Di-Poï N, personal communication). In
situ hybridizations on paraffin-embedded sections for adult 
eyes of zebrafish, Xenopus, and green anole were performed 
using the aforementioned riboprobes as previously described 
(Kuraku et al. 2005). 

Molecular Phylogenetic Analysis 

An optimal multiple alignment of all retrieved cDNA sequences 
was constructed using MEGA5 (Tamura et al. 2011), in which
the MUSCLE program (Edgar 2004) is implemented. Molecular
phylogenetic trees were inferred using regions of selected ver- 
tebrate Pax amino acid sequences that were unambiguously 
aligned with no gaps (supplementary table S5, Supplementary 
Material online). Several sequence fragments of previously
unidentified Pax4/6/10 genes (supplementary table S6,
Supplementary Material online) were excluded from the
phylogenetic analysis depicted in figure 2B, but their affiliation to
individual Pax subtypes was verified in separate phylogenetic
analyses (data not shown). The human, Ciona intestinalis, and fly
genes belonging to the Pax3/7 gene class served as outgroup 
because of the lack and truncation of the homeodomains of 
Pax1/9 and Pax2/5/8 class of genes, respectively. MEGA5 was 
used to determine the optimal amino acid substitution model 
and to reconstruct maximum likelihood (ML). Bayesian tree in- 
ference was performed using MrBayes 3.1 (Huelsenbeck and 
Ronquist 2001), with which we ran two independent chains 
with 5,000,000 generations for each, sampled every 1 00 gen- 
erations, and excluded 25% of the samples as burnin. 
Convergence of the two chains was diagnosed using the 
Tracer v1 .5 software (http://tree.bio.ed.ac.uk/software/tracer/, 
last accessed June 26, 2014). The main data set consisted of a 
broad selection of metazoan genes belonging to the Pax4/6/1 0 
class of genes (fig. 2B). In addition, we performed a phyloge- 
netic analysis focusing on vertebrate PaxG genes in which am- 
phioxus PaxG genes served as outgroup (supplementary fig. S2, 
Supplementary Material online). To assess the statistical sup- 
port for alternative hypotheses (fig. 2A), per site log likelihoods 
of two constrained tree topologies and the ML tree were cal- 
culated using Tree-Puzzle (Schmidt et al. 2002) and statistically 
assessed using CONSEL (Shimodaira and Hasegawa 2001).
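
As a quick check of the Bayesian sampling settings given above, the number of samples retained per chain can be worked out as below. This is a back-of-the-envelope sketch; the helper name is hypothetical, and the count ignores the starting state, which a sampler may also record.

```python
def retained_samples(generations, sample_every, burnin_fraction):
    """Return (total, discarded, retained) sample counts for one chain."""
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


# Settings from the text: 5,000,000 generations, sampled every 100, 25% burn-in.
print(retained_samples(5_000_000, 100, 0.25))
```

With these settings each chain yields 50,000 samples, of which 12,500 are discarded and 37,500 contribute to the posterior summary.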

Identification of Conserved Synteny 

Using the Ensembl database, we identified genes within a
1-Mb region flanking pax6 in the genome of the spotted
gar. Orthologs of these genes in zebrafish, stickleback, and
medaka were downloaded via the BioMart interface and
plotted against the focal region of the spotted gar (fig. 3A). The
three gene families that possess members in the vicinity
of both pax6a and -6b genes were further analyzed. ML trees were
reconstructed as described above to infer the evolutionary 
origin of the teleost duplicates (fig. 3B). 
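
The selection of genes in a flanking window can be sketched as follows. This is an illustration under stated assumptions: the window is taken as 1 Mb on each side of the focal gene (the text does not specify whether 1 Mb is per side or in total), and the gene names, linkage groups, and coordinates are hypothetical rather than taken from Ensembl.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def flanking_genes(genes, focal_name, flank=1_000_000):
    """Return names of genes overlapping the window around the focal gene."""
    focal = next(g for g in genes if g.name == focal_name)
    lo, hi = focal.start - flank, focal.end + flank
    return [
        g.name
        for g in genes
        if g.name != focal_name
        and g.chrom == focal.chrom
        and g.end >= lo
        and g.start <= hi
    ]


# Hypothetical coordinates for illustration only.
genes = [
    Gene("pax6", "LG1", 5_000_000, 5_020_000),
    Gene("geneA", "LG1", 4_200_000, 4_250_000),
    Gene("geneB", "LG1", 6_500_000, 6_550_000),
    Gene("geneC", "LG2", 5_000_000, 5_010_000),
]
print(flanking_genes(genes, "pax6"))
```

Here only geneA falls inside the window; geneB lies too far downstream and geneC is on a different linkage group. The orthologs of the retained genes would then be looked up in the teleost genomes.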
Fig. 3. Evolutionary origins of pax6-neighboring genes in teleost genomes. (A) Conserved synteny between the genomic regions of the gar Pax6
ortholog and the two teleost pax6 subtypes. Co-orthologs to the spotted gar genes located within 1 Mb flanking the pax6 gene that are harbored in the
vicinity of pax6a and/or -6b of selected teleosts are shown in the same color (see supplementary table S9, Supplementary Material online, for accession IDs).
Asterisks mark names of the selected gene families that retained both duplicates derived from the TSGD. (B) Molecular phylogenetic trees of selected gene
families. ML trees of three gene families are shown, and statistical support values (P values) of alternative scenarios assuming a more ancient origin of 
The search for conserved intragenomic synteny (Kuraku
and Meyer 2012) in the green anole focused on up to
10-Mb genomic regions flanking the Pax6 and -10 genes.
Referring to the Ensembl "Gene Tree," the duplication
patterns of neighboring gene families were investigated. We
identified two additional chromosomal regions that harbor a 
similar gene array. A second search for paralogous relation- 
ships between these two identified chromosomal regions was 
conducted. The identified pairs, triplets, and quartets of para- 
logs located on the four chromosomes were plotted (fig. 4). 

In order to analyze the mode of the putative loss of Pax10
in the mammalian and avian lineages, we downloaded a list of
Ensembl IDs of genes harbored in the 500-kb genomic regions
flanking the green anole Pax10 gene, together with IDs of
opossum, human, and chicken orthologs of those genes via 
the BioMart interface. Opossum, human, and chicken ortho- 
logs located on chromosomes 4, 19, and 27, respectively, 
were plotted along the orthologous region on anole chromo- 
some 6 (fig. 5 and supplementary tables S7 and S8, Supple- 
mentary Material online). 

Results 

Identification of Novel Pax4/6/10 Genes in Diverse
Vertebrates Except for Mammals and Birds 

We identified a group of protein-coding genes closely related 
to Pax4 and -6 in the Gene Tree view in release 54 of
the Ensembl genome database (http://www.ensembl.org, last
accessed June 26, 2014; released in spring 2009; Hubbard et al.
2009). In this tree at Ensembl, these uncharacterized genes
possessed by zebrafish (ENSDARG00000053364) and stickleback
(Gasterosteus aculeatus; ENSGACG00000007835) were
placed basal to the vertebrate Pax6 group of genes. More
recent Ensembl releases (e.g., release 66) suggest that this
gene is also possessed by X. tropicalis
(ENSXETG00000032534), the green anole lizard
(A. carolinensis; ENSACAG00000013797), Atlantic cod
(Gadus morhua; ENSGMOG00000007260), and Nile tilapia
(Oreochromis niloticus; ENSONIG00000020082). Following
the conventional nomenclature of the Pax gene family (Pax1-9)
and our intensive phylogenetic analysis (see below), we
designated this group of genes as Pax10. Our survey of publicly
available genome-wide sequence resources (supplementary
table S1, Supplementary Material online) resulted in the
identification of Pax10 orthologs in the painted turtle (Chrysemys
picta bellii; Shaffer et al. 2013), the Burmese python (Python
molurus; Castoe et al. 2011), the American alligator (Alligator
mississippiensis; St John et al. 2012), the saltwater crocodile
(Crocodylus porosus; St John et al. 2012), the coelacanth
(Latimeria chalumnae; Amemiya et al. 2013), and the elephant
shark (C. milii; Ravi et al. 2013). Moreover, in All. mississippiensis,
Chr. picta bellii, and L. chalumnae, we identified orthologs
of the Pax4 gene, which was previously identified only in teleosts
and mammals (see supplementary tables S1, S5, and S6,
Supplementary Material online, for details). The Pax4 genes
of the Chinese soft-shell turtle (Pelodiscus sinensis;
ENSPSIG00000006191) and the spotted gar (Lepisosteus oculatus;
ENSGACP00000026154_1 at Ensembl Pre) are available
in the Ensembl genome database since release 68 and in
Pre-Ensembl, respectively. So far, no Pax4 ortholog has been
identified in transcriptomes and genomes of birds, lepidosaurs,
amphibians, and chondrichthyans, whereas Pax10 is most likely
absent from the genomes of mammals and birds (fig. 1B). In
addition to the previously described Japanese lamprey Pax6 
gene (AB061220 in NCBI nucleotide), we identified another 
Pax6-like sequence in the genome assembly of this species
(Mehta et al. 2013) that we designated Pax6B. The putative
ORF of this sequence is disrupted by a stretch of N's and thus is
partial. Therefore, this Japanese lamprey Pax6B gene was not
included in the phylogenetic analysis of the entire Pax4/6/10
class of genes (fig. 2), whereas it was included in the
phylogenetic analysis of its subset, Pax6 (supplementary fig. S2,
Supplementary Material online). We would like to alert our
readers to potential misidentification of these genes, as the
novel relatives of Pax6, Pax10 genes, are sometimes annotated
as Pax6 or Pax6-like in Ensembl.

Domain Structure of the Deduced Pax10 Protein 
Sequences 

The deduced amino acid sequences of the Pax10 genes,
identified above in silico, contain a complete homeodomain
preceded by a putative start codon, but lack the characteristic
paired domain common to all other Pax proteins (fig. 1A). To
rule out the possibility that incomplete annotations of genome
assemblies led to the nonidentification of the paired domain,
we performed extensive searches with tBLASTn in nucleotide
sequences of the selected genomic contigs containing Pax10
using peptide sequences of the paired domain as queries.
However, this approach did not reveal any potential paired
domain upstream of the identified Pax10 ORFs. By means of
RT-PCR, we isolated full-length cDNAs of a single Pax10
member in the zebrafish (designated pax10a by Ravi et al.



Fig. 3. Continued

the pax6a-linked genes (compatible with Hypothesis 1 in fig. 2A) are given in the grey box on the lower left as inset. The Depdc7 phylogeny is based on 321
amino acid residues and the JTT+Γ4 model (shape parameter of the gamma distribution α = 1.39) was assumed. The ML tree of Wt1 genes was inferred from
331 amino acid residues assuming the JTT+F+Γ4 model (shape parameter of the gamma distribution α = 1.45). The Kiaa1549l phylogenetic analysis used an
alignment of 295 amino acid residues and the JTT+Γ4 model (shape parameter of the gamma distribution α = 1.30). Bootstrap probabilities are provided for
each node. The scale bars on the upper left of each phylogenetic tree indicate 0.2 substitutions per site. 
Fig. 4. Intragenomic conserved synteny between Pax6- and -10-containing regions in the green anole lizard. Outer grey bars represent chromosomes 1
and 6 of the green anole that harbor Pax6 and -10 (shown in red), respectively, and their paralogous regions on chromosomes 4 and 5. Magnifications of the
genomic regions indicated on the chromosomes (gray bars) are shown in the center. Gene-by-gene paralogies among the four members of the quartet are 
highlighted with diagonal lines: Gray lines for paralogy of gene families with two paralogs, blue lines for three paralogs, and green line for four paralogs. 



2013 based on a comparison of syntenic relationships among 
teleosts) and A. carolinensis. We also isolated two full-length 
cDNAs of X. laevis (designated Pax10a and -10b), providing evidence of
transcription of these genes. In silico translation of these cDNA
sequences confirmed the lack of a paired domain in the
products of these Pax10 genes.

Phylogenetic Relationships among Pax4, -6, and -10 

We performed a molecular phylogenetic analysis employing a 
data set containing the members of the Pax4/6/10 class
covering major metazoan taxa as well as human, Ciona intestinalis,
and fly Pax3/7 (paired) genes as outgroup (see supplementary 
table S5, Supplementary Material online). The inferred ML tree 



(fig. 2B) placed the putative Nematostella vectensis Pax6 
ortholog outside the monophyletic group containing bilaterian 
Pax4, -6, and -10 genes. Within this group, gnathostome 
Pax10 genes formed a monophyletic group (bootstrap
probability in the ML analysis, 98). Except for the opossum Pax6,
gnathostome Pax6 genes formed a monophyletic group
that was inferred to be a sister group of the Pax10 group of
genes (bootstrap probability in the ML analysis, 30). The two
previously identified cyclostome Pax genes, namely Pax6 of
the inshore hagfish (Eptatretus burgeri) and Pax6 of the
Japanese lamprey (Lethenteron japonicum, also referred to
as Arctic lamprey Lethenteron camtschaticum), form an exclu- 
sive cluster that is placed at the base of the gnathostome Pax6 
and -10 subgroups (fig. 2B). It should be noted that the newly 
Fig. 5. Conserved synteny between the Pax10-containing region in the green anole and its orthologous regions in mammals and birds. A 1-Mb region
flanking the green anole Pax10 gene, shown in red, was analyzed and gene-by-gene orthologies are indicated with gray lines. (A) Conserved synteny
between the green anole Pax10-containing region and its orthologous regions in the opossum and human genomes. The dense pattern of one-to-one
orthologies suggests a small-scale deletion of Pax10 in the lineage leading to eutherians, before the split of the marsupial lineage. (B) Synteny between the
green anole Pax10-containing region and the chicken genome. The lack of one-to-one orthologies in the region around the green anole Pax10 gene is best
explained by a large-scale deletion in the avian lineage. Green anole genes whose orthologs are located elsewhere in the opossum, human, or chicken 
genome are indicated with blue bars, whereas green anole genes lacking orthologs in these genomes are shown with grey bars. Exact genomic locations and 
accession IDs of the identified orthologs are included in supplementary tables S7 and S8, Supplementary Material online. chr., chromosome.



identified L. japonicum PaxGB gene and a Pax6 gene 
(AAM18642.1 in NCBI) of a different lamprey species 
(Lampetra fluviatilis) were excluded from this analysis because
of their incomplete sequences. However, a phylogenetic analysis
focusing on vertebrate Pax6 genes including the L. japonicum
Pax6B gene supported its grouping with the inshore
hagfish Pax6 gene (bootstrap probability in the ML analysis, 
73; supplementary fig. S2, Supplementary Material online). 
The previously identified L. japonicum Pax6 gene showed 
higher affinity to gnathostome Pax6 genes (bootstrap proba- 
bility in the ML analysis, 54; supplementary fig. S2, 
Supplementary Material online). The phylogenetic relationships
among invertebrate Pax6/eyeless genes barely reflect
the generally accepted species phylogeny (fig. 1B). The
group of Pax4 genes forms a monophyletic cluster that is
placed outside all other bilaterian Pax6 and -10 genes.
Support values for this inferred tree are generally low. This
poorly resolved tree topology prevents us from deriving
more clear-cut conclusions about the phylogenetic relationships
among the individual Pax4, -6, and -10 subgroups. In
particular, the phylogenetic position of the acanthopterygian
pax6a (pax6.3 in Ravi et al. 2013) genes in relation to other
vertebrate Pax6 genes could not be reliably inferred by this
analysis. Although the ML analyses suggest an origin of
acanthopterygian pax6a and -6b genes through the TSGD
(fig. 2B), the support for this duplication node (bootstrap
probability in the ML analysis, 18) is poor.

To test the hypothesis postulating an ancient origin of
acanthopterygian pax6a (pax6.3) genes (fig. 2A; Ravi et al.
2013), we complemented the heuristic ML tree reconstruction
with statistical tests. We simulated tree topologies for alternative
scenarios supporting an early (Hypothesis 1; Ravi et al.
2013) or late (Hypothesis 2) origin of acanthopterygian pax6a
(pax6.3) genes (fig. 2A) and computed likelihoods for them.
Although the latter hypothesis yielded larger likelihood values,
no significant difference was observed in the levels of support
between the two hypotheses: The P values of Hypothesis 1 are
0.12 (approximately unbiased [AU] test) and 0.42
(Shimodaira-Hasegawa [SH] test), and those of Hypothesis 2
are 0.38 (AU test) and 0.44 (SH test). Thus, the question about
the evolutionary origin of acanthopterygian pax6a (pax6.3)
genes cannot be answered with a molecular phylogenetic
approach alone. Therefore, we investigated gene orders in the
genomic regions flanking teleost pax6 genes.
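
The reading of these test results can be illustrated with a minimal sketch. It only tabulates the four P values reported above and checks each against a significance level; the 0.05 threshold and the rule "rejected if any test falls below the threshold" are assumptions made for illustration and are not stated in the text, and the function names are hypothetical.

```python
# P values reported in the text for the two candidate topologies.
P_VALUES = {
    "Hypothesis 1 (2R-WGD origin)": {"AU": 0.12, "SH": 0.42},
    "Hypothesis 2 (TSGD origin)": {"AU": 0.38, "SH": 0.44},
}

# Assumed significance level; the text does not state one for this comparison.
ALPHA = 0.05


def is_rejected(test_p_values, alpha=ALPHA):
    """Return True if at least one test rejects the topology at level alpha."""
    return any(p < alpha for p in test_p_values.values())


def summarize(table, alpha=ALPHA):
    """Map each hypothesis name to whether it is rejected."""
    return {name: is_rejected(p, alpha) for name, p in table.items()}


if __name__ == "__main__":
    for name, rejected in summarize(P_VALUES).items():
        status = "rejected" if rejected else "not rejected"
        print(f"{name}: {status}")
```

Under these assumptions neither topology is rejected, which is why the gene tree alone cannot discriminate between the two scenarios.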

We detected conserved synteny between the genomic
region containing the single Pax6 gene in the spotted gar
and those containing pax6a and -6b genes in medaka, stickleback,
and zebrafish (fig. 3A and supplementary table S9,
Supplementary Material online). The TSGD presumably
occurred after the split of teleost fish from the lineage of
Lepisosteiformes, including the spotted gar (Hoegg et al.
2004; Crow et al. 2006; Hurley et al. 2007; Amores et al.
2011). Thus, this one-to-two relationship between the spotted
gar and teleost fish matches the pattern resulting from the
TSGD. We also investigated the molecular phylogeny of the
neighboring gene families (fig. 3B). We sought to determine
the timing of duplication that gave rise to pairs of genes flanking
acanthopterygian pax6a and -6b, namely in the Depdc7,
Wt1, and Kiaa1549l gene families. The inferred ML trees
suggested a duplication that gave rise to the teleost paralogs in
question in the actinopterygian fish lineage (fig. 3B). Two of
the gene families (Depdc7 and Kiaa1549l) indicated duplications
after the split of the spotted gar from the teleost stem
lineage, coinciding with the TSGD. To further test this hypothesis,
we computed likelihoods of alternative, non-ML tree
topologies supporting the ancient origin of pax6a (pax6.3)-linked
genes at the base of vertebrate evolution. This analysis
rejected an ancient origin of pax6a-linked genes at the
base of vertebrate evolution for two of the three gene families
(Depdc7 and Kiaa1549l) with P values below 0.1 in the AU
and SH tests. For the Wt1 gene family, this hypothesis was not
significantly rejected (P value of 0.29 in the AU test and 0.28 in
the SH test), but was less likely than a more recent duplication
in the TSGD (fig. 3B). Although not unambiguously inferred by
our analysis, the timing of duplication of Wt1 genes in the
TSGD has been demonstrated previously (Kluever et al. 2009).
The synchrony of the duplications of these flanking genes
suggests that they were duplicated in a large-scale event,
probably in the TSGD. Hence, the paralogy between pax6a
and -6b genes, embedded in this region, should have
originated through the TSGD (see fig. 2A).

Are Pax4, -6, and -10 Derived from the 2R-WGD? 

Because our molecular phylogenetic analysis did not yield high 
confidence about the timing of diversification of the Pax4/6/10
gene class, we employed a synteny analysis (see Materials
and Methods). An intragenomic synteny analysis of the green 
anole lizard showed a dense pattern of gene-by-gene para- 
logies among four chromosomal regions including those con- 
taining Pax6 and -10 (fig. 4). The gene families whose 
duplications coincide with the 2R-WGD show a dense pattern 
of gene-by-gene paralogy among the identified regions on 
chromosomes 1, 4, 5, and 6 (fig. 4). In addition, five of the 
identified gene families were previously shown to have diver- 
sified early in vertebrate evolution (Manousaki et al. 2011). 
Interestingly, the orthologous regions in the human and 
chicken genomes were suggested to be derived from a 
single ancestral vertebrate chromosome, namely "D,"
reported by Nakatani et al. (2007). These results suggest that
the quartet of the identified regions, on chromosomes 1, 4, 5,
and 6 in the green anole genome, originated from a
large-scale quadruplication, probably as a result of the 2R-WGD.

Mode of Secondary Loss of Pax10 in Mammals

Our search for Pax10 genes in public databases and subsequent
phylogenetic analyses suggested the absence of this
gene in mammals and birds (see figs. 1B and 2). We employed
intergenomic synteny analyses in order to determine whether
the absence was caused by a large-scale deletion of a chromosomal
segment or a small-scale gene loss. If it was caused by a
major deletion, simultaneous loss of multiple neighboring
genes should be observed. Our comparison of the 1-Mb
genomic region flanking the Pax10 gene in the green
anole with the human and opossum genomes revealed conserved
synteny (fig. 5A). Of the 19 protein-coding genes contained in
this region in the green anole, 15 and 14 genes, respectively,
have orthologs in particular regions on human chromosome
19 and opossum chromosome 4 (see supplementary table S7,
Supplementary Material online, for exact positions of the
orthologs). Notably, orthologs of one immediate neighbor of
Pax10 (an uncharacterized gene encoding a 776-amino-acid-long
putative peptide; ENSACAG00000013620) are also
absent from mammalian genomes. Thus, the absence of the
Pax10 gene from mammalian genomes was presumably
caused by a small-scale deletion early in mammalian evolution
before the divergence between the marsupial and eutherian
lineages. If there was no massive chromosomal rearrangement
that could have hindered our synteny analyses, this
deletion event in the mammalian lineage could have involved
as few as two genes.

We also performed a synteny comparison between the
green anole and the chicken. Of the 19 genes in the Pax10-containing
region in the green anole genome, only two were
unambiguously shown to have orthologs in the chicken
genome (see supplementary table S8, Supplementary
Material online, for positions of the orthologs). Those two
anole genes are more than 230 kb away from the Pax10
gene (fig. 5B). The absence of the chicken orthologs of
Pax10 and other genes in this genomic region in anole suggests
that its counterpart region was lost in the lineage leading
to chicken. Compared with the small-scale loss in the mammalian
lineage, the loss in the avian lineage could have been
larger in size, possibly involving more than ten genes.
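
The contrast between the two modes of loss can be made concrete with a small sketch that uses only the ortholog counts given above (19 anole genes in the region; 15, 14, and 2 with orthologs in human, opossum, and chicken). The 0.5 cutoff separating "small-scale" from "large-scale" loss is a hypothetical threshold chosen for illustration, not a criterion used in the analysis, and the function names are likewise illustrative.

```python
# Counts taken from the text: genes in the 1-Mb anole region and how many of
# them have orthologs in the syntenic region of each compared genome.
REGION_GENES = 19
ORTHOLOGS_FOUND = {"human": 15, "opossum": 14, "chicken": 2}

# Hypothetical cutoff: if at least half of the neighbors are retained, the
# absence of Pax10 is treated as a small-scale loss.
THRESHOLD = 0.5


def retained_fraction(found, total=REGION_GENES):
    """Fraction of anole genes in the region with an ortholog in the target."""
    return found / total


def loss_mode(found, total=REGION_GENES, threshold=THRESHOLD):
    """Classify the loss as 'small-scale' or 'large-scale' by retained neighbors."""
    if retained_fraction(found, total) >= threshold:
        return "small-scale"
    return "large-scale"


if __name__ == "__main__":
    for species, found in ORTHOLOGS_FOUND.items():
        fraction = retained_fraction(found)
        print(f"{species}: {found}/{REGION_GENES} retained "
              f"({fraction:.2f}), {loss_mode(found)} loss")
```

With these counts, the mammalian genomes retain most neighbors of Pax10, whereas the chicken genome retains very few, matching the distinction drawn in the text.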

Expression Patterns of Pax10 in Zebrafish, Xenopus, and
Green Anole

We conducted semiquantitative RT-PCR using a developmental
series of zebrafish to characterize the temporal expression
profile of pax10a. This experiment suggested an onset of
pax10a expression at 25 h postfertilization (hpf) with the maximal
expression level at 5 days postfertilization (dpf; fig. 6A
and supplementary fig. S1, Supplementary Material online).

Fig. 6. — Expression profiles of Pax4, -6, and -10 in zebrafish, Xenopus, and green anole. (A) Expression levels of pax10a in a developmental series of
zebrafish with semiquantitative RT-PCRs. Heat map indicates upregulation of pax10a in late developmental stages reaching a plateau at 5 dpf. Color scale at
the right indicates the relative expression levels normalized to values between 0 and 1 for no expression (blue) and for the highest observed expression level



(continued) 



Genome Biol. Evol. 6(7):1635-1651. doi:10.1093/gbe/evu135 Advance Access publication June 19, 2014






This result prompted us to examine whether Pax10 is expressed
in adult animals. Therefore, we analyzed Pax10 expression
levels, as well as those of Pax6, in individual organs of
zebrafish, Xenopus, and green anoles. Our comparison revealed
strong Pax10 expression in the eyes of all three examined
animals and also subtle expression in testes of Xenopus
and brain tissue of zebrafish and the green anole (fig. 6B). It
also revealed intense Pax6 expression in the eyes and
weaker expression in the brain, pancreas, and testis, which
was common to all the animals examined (fig. 6B). We
performed clustering of the analyzed genes based on their
similarity in expression levels in the examined organs (fig. 6B). It
partly recovered their phylogenetic relationship: closely related
gene pairs (e.g., zebrafish pax6a and -6b, as well as Xenopus
Pax10 and zebrafish pax10a) also showed the highest similarity
in expression patterns.

In order to scrutinize the expression patterns of these
genes, we performed in situ hybridizations on sectioned
eyes of adult zebrafish, Xenopus, and green anoles. We
performed expression analyses of pax4, -6a, -6b, and -10a genes
in zebrafish, Pax6 and -10 in the green anole, and Pax6a and
-10b in X. laevis and identified distinct expression signals of all
investigated genes in specific layers of the mature retina,
namely, intensively in the inner nuclear layer and weakly in
the ganglion cell layer (fig. 6C-R). In addition, we identified
transcripts of zebrafish pax6a, -6b, and -10a in the horizontal
cell layer (fig. 6D, F, and H) and those of zebrafish pax4 in the
outer nuclear layer (fig. 6J). In the retina of the green anole,
the expression domain of Pax6 in the inner nuclear layer was
revealed to be nested within the broader area of Pax10-expressing
cells (fig. 6L and N).

Discussion 

Phylogeny of the Pax4/6/10 Class of Genes 

Only recently, a novel relative of Pax6, designated Pax6.2
(Pax10 in this study), was reported (Ravi et al. 2013). That
study ignored Pax4, the other relative previously revealed to
have been split from Pax6 in the 2R-WGD (Manousaki et al.
2011). Therefore, in this study, we included Pax4 genes in the
data set and sought to reconstruct the evolutionary history of
the entire Pax4/6/10 gene class. The group of Pax4 genes,
with long branches, was placed outside the clade containing
vertebrate and invertebrate Pax6 genes as well as Pax10 genes
(fig. 2B). Monophyly of Pax6 could not be unambiguously
inferred (fig. 2B), and the group of Pax10 genes was placed
inside the vertebrate Pax6 genes. As shown for many other
genes (Qiu et al. 2011; Kuraku 2013), the phylogenetic
positions of the cyclostome Pax6 homologs were not
unambiguously determined (fig. 2B). Still, Japanese lamprey Pax6B,
newly reported in this study, was suggested to be an ortholog
of the previously reported hagfish Pax6 gene (supplementary
fig. S2, Supplementary Material online). The previously
reported Japanese lamprey Pax6 was instead suggested to be
orthologous to gnathostome Pax6 genes (supplementary fig.
S2, Supplementary Material online). As for the position of
acanthopterygian pax6b genes questioned by Ravi et al.
(fig. 2A), our phylogenetic analysis (fig. 2B) supported the
scenario that they originated in a duplication in the teleost
fish lineage (Hypothesis 2) instead of the scenario that they
originated in the 2R-WGD (Hypothesis 1).

Synteny analyses have been widely used as a reliable tool to
identify traces of WGD (Kuraku and Meyer 2012 and references
therein). They are particularly useful when molecular
phylogenies of the involved genes cannot be reliably reconstructed.
Our analysis demonstrated conserved synteny among four
chromosomal regions including the two containing Pax6
and -10, respectively (fig. 4). The duplications giving rise to
this quartet, inferred by the diversification patterns of
neighboring gene families (Manousaki et al. 2011), occurred after
the split of the tunicate and cephalochordate lineages but
before the cyclostome-gnathostome split. This timing
coincides with the 2R-WGD early in vertebrate evolution (Kuraku
et al. 2009).

Origin of Acanthopterygian pax6a Genes

Fig. 6. — Continued
(red), respectively. (B) Heat map visualizing expression levels of Pax6 and -10 in individual organs of adult zebrafish, Xenopus, and green anole. In adult
zebrafish, pax6a, -6b, and -10a transcripts were detected in the brain, testis, and eye. High levels of Pax6 expression were detected in Xenopus laevis brain
and eye, and low levels in pancreas and intestine, whereas Pax10 expression signals were found in the eye and testis. In green anole, Pax6 transcripts were
detected in the eye, brain, and testis and at low concentrations in pancreas, and Pax10 expression was detected in the eye and brain. It should be noted that
the zebrafish pancreas was not analyzed (NA) because its anatomical structure was not precisely identified. The phylogram on the left reflects the
phylogenetic relationships of the genes inferred in figure 2B, whereas that on the right shows the clustering based on their expression levels in various organs.
(C-R) In situ hybridizations of Pax4, -6, and -10 orthologs in the retinas of adult zebrafish, Xenopus, and the green anole. C-J show expression patterns in the
retinas of an albino zebrafish, K-N of a green anole, and O-R of a X. laevis. D, F, H, J, L, N, P, and R are magnifications of the rectangles in C, E, G, I, K, M, O,
and Q, respectively. All investigated genes were strongly expressed in the inner nuclear layer (i) and weakly in the ganglion cell layer (g). Zebrafish pax6a, -6b,
and -10a also showed weak expressions in the horizontal cell layer (h in D, F, and H) and zebrafish pax4 transcripts were detected in the outer nuclear layer (o
in J). It was evident from the results for Anolis carolinensis that the Pax6 expression is nested within that of Pax10 in the inner nuclear layer of the retina
(arrows in L and N). Scale bar: 200 µm.

The study by Ravi et al. suggested the origin of acanthopterygian
pax6a (pax6.3) genes in the 2R-WGD (Hypothesis 1 in
fig. 2A). Evidence for this hypothesis was: 1) "Subpartitioning"
of conserved noncoding elements, 2) the absence of
exon 5a from pax6a (pax6.3) genes, and 3) a simple molecular
phylogenetic analysis with the neighbor-joining method and
excluding Pax4 genes (Ravi et al. 2013). The first evidence, that
is, the divergence or loss of cis-elements between acanthopterygian
pax6a (pax6.3) and other gnathostome Pax6 genes,
can be explained by subfunctionalization and asymmetric
retention of expression domains after the gene duplication. It is
a recurrent pattern in gene family evolution that one duplicate,
here the acanthopterygian pax6b, retains the ancestral
role, whereas the other duplicate, pax6a (pax6.3) in this case,
is more prone to changes in its function (reviewed in Prince
and Pickett 2002). The sole reliance on presence and absence
of conserved noncoding elements, without any tree-based
approach, might therefore mislead the reconstruction of
phylogenetic relationships. Regarding the second evidence, Ravi
et al. regarded the lack of exon 5a in acanthopterygian pax6a
(pax6.3) genes as the evidence of exclusion of those genes
from the canonical Pax6 genes (Ravi et al. 2013). On the other
hand, however, they report "N-terminal protein extension" of
all teleost pax6 genes (including both pax6a and -6b), which
can be regarded as evidence of phylogenetic proximity between
pax6a and -6b. Thus, the comparison of gene structures
does not add to our understanding of the evolution of
Pax6 genes. As for the third evidence mentioned above, we
reanalyzed the phylogeny of the entire Pax4/6/10 class by
employing a more rigorous methodology in light of the standard
procedure of molecular phylogenetics (Anisimova et al. 2013).
Although our molecular phylogenetic analysis focusing on the
Pax gene family did not unambiguously solve the question
about the timing of the acanthopterygian pax6a-6b split,
expansion of the scope of our study to a broader genomic region
enhanced the power of the analysis (fig. 3). Two of the three
gene families whose members reside near pax6a and -6b
(Depdc7 and Kiaa1549l) unambiguously rejected their more
ancient origins in the 2R-WGD, and one gene family (Wt1)
also strongly favored their duplication in the TSGD (fig. 3B).
This strongly suggests that these genomic regions, including
pax6a and -6b genes, originated in the TSGD (Hypothesis 2 in
fig. 2A). Ravi et al. also analyzed conserved synteny (fig. 6B
and C of Ravi et al. 2013) but did not recognize the copresence
of the Depdc7, Wt1, and Dnajc24 family members as
evidence of the orthology between the zebrafish pax6a
(pax6.3 in their hypothesis) and the pax6b of the other teleost
fishes. Another argument supporting Hypothesis 2 is the
phylogenetic distribution of teleost pax6 genes. According to Ravi
et al., acanthopterygians lost one of the TSGD-derived duplicates
orthologous to Pax6 and are the only taxon retaining the
2R-derived pax6.3 gene (Hypothesis 1 in fig. 2A). This hypothesis
assumes independent losses of its orthologs in all other
vertebrate lineages (cyclostomes, chondrichthyans, nonteleost
actinopterygians, coelacanth, and tetrapods as depicted in fig.
9 in Ravi et al. 2013). Taken together, we strongly argue that
the genes designated as pax6.3 by Ravi et al. are indeed the
second TSGD-derived gene, orthologous to Pax6, which
should continue to be called pax6a (Hypothesis 2 in fig. 2A).

Asymmetric Gene Retention Rates between Pax6 and
Pax4/-10

In the course of vertebrate evolution, Pax4, -6, and -10 genes
experienced different patterns of retention between vertebrate
lineages. Although the Pax6 gene is present in all vertebrates
investigated to date, Pax4 and -10 were lost multiple
times secondarily in independent lineages (fig. 1B). This
asymmetry of gene retention was also manifested after the TSGD:
in each teleost genome that we analyzed, we did not identify
more than one pax10 gene (fig. 2B). Using a synteny-based
approach, it was suggested that the zebrafish pax10a gene is
a TSGD-derived paralog of the identified acanthopterygian
pax10b genes (Ravi et al. 2013). The tetraploid X. laevis is
the only species possessing two copies of both Pax6 and -10
genes (see fig. 2B).

Although the general trend of unequal patterns of gene
retention is clear, the exact number of secondary gene losses
of Pax4 and -10 remains difficult to determine. The certainty
with which one can infer a secondary gene loss in a given
lineage depends on the quality of available sequence resources
and the density of taxon sampling. In Ensembl release
71, the mammalian lineage is represented with 40 genomes,
including those sequenced with low coverage, whereas only
one amphibian and six sauropsid genomes are available.
However, the fact that we identified a Pax6 gene in every
genome we investigated suggests that the absences of Pax4
and -10 genes are not caused by insufficient sequence
information, but rather by secondary gene losses. In contrast, the
absence of Pax10 genes from the mammalian lineage can be
inferred with a high degree of confidence. By investigating
synteny between the anole Pax10-containing genomic
region and orthologous regions in mammalian and avian
genomes, we demonstrated that different modes of gene loss
caused the absence of Pax10 in these two lineages. Although
we identified a trace of small-scale loss in mammalian
genomes, loss of a larger genomic segment was suggested in
the avian lineage (fig. 5).

Functional Diversification of Pax4, -6, and -10 Genes 

An explanation for the asymmetric gene retention between
Pax6 and the other two paralogs, Pax4 and -10 genes, might
be rapid subfunctionalization soon after the 2R-WGD in which
only Pax6 retained the essential role as master control gene for
eye development. This crucial role could have imposed a high
selection pressure on the Pax6 gene, making it indispensable.
Comparison of Pax10 expression patterns among a teleost fish
(zebrafish), an amphibian (X. laevis), and a reptile (the green
anole lizard) allowed us to corroborate this hypothesis and
infer the ancestral expression profile of the euteleostome
Pax10 gene. We did not identify significant Pax10 expression
in early embryos of the three examined species, and
quantification of Pax10 transcripts of a developmental series of
zebrafish suggested the onset of pax10a expression during later
developmental stages (fig. 6A). This finding is in accordance
with a previous study reporting an increasing expression level
of zebrafish pax10a from 1 dpf to 5 dpf (Ravi et al. 2013).
Morpholino depletion of pax10a in zebrafish causes a reduction
in eye size (Ravi et al. 2013), indicating a restricted role of
pax10a in eye growth. The crucial role in the initiation of eye
development is presumably retained solely by the Pax6 gene.

Our semiquantitative RT-PCR of adult zebrafish, Xenopus,
and green anoles confirmed the Pax6 expression in the eye,
pancreas, and brain (fig. 6B), as described previously (Nornes
et al. 1998; Moreno et al. 2008). Exceptions are the
expression signals of Pax6 as well as of Pax10 identified in the
testes. This expression domain has not been described for any
Pax4 or -6 gene before. Excessive gene expression in testes can be
explained by a permissive state of chromatin in testes owing to
the peculiar properties of transcription in meiotic and
postmeiotic cells ("out of testis" hypothesis; Kaessmann 2010). In
light of this hypothesis, these identified transcripts might
merely be caused by leaky transcription. Results of our
semiquantitative RT-PCR revealed expression of Pax10
genes in the eyes of all three investigated species and also in
the brain tissue of zebrafish and the green anole, but not
X. laevis (fig. 6B). Our inference based on the maximum
parsimony principle suggests that the Pax10 gene in the ancestral
euteleostome was expressed in the eyes, testes, and brains of
adults. The elephant shark Pax10 gene also shows expression
in mature eyes (Ravi et al. 2013), indicating that the Pax10
expression in the adult eyes was established before the split
between the euteleostome and chondrichthyan lineages. On
the other hand, we observed among-lineage differences in
the Pax10 expression: Expression in the adult kidney is
unique to the elephant shark, whereas those in the adult
brains and testes are confined to euteleostomes (fig. 6B;
Ravi et al. 2013).
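
The parsimony-based inference for the euteleostome ancestor can be sketched with the bottom-up pass of Fitch parsimony. The example below is an illustration, not the procedure actually used here: it assumes the species tree ((green anole, Xenopus), zebrafish), encodes Pax10 expression per organ as present or absent following the organ list given in the legend of figure 6B (zebrafish: eye, brain, testis; Xenopus: eye, testis; green anole: eye, brain), and uses hypothetical function and variable names.

```python
# Assumed species tree as nested tuples: tetrapods are sister taxa and
# zebrafish is the outgroup within euteleostomes.
TREE = (("green anole", "Xenopus"), "zebrafish")

# Presence (True) or absence (False) of adult Pax10 expression per organ,
# following the organ list in the legend of figure 6B.
EXPRESSION = {
    "eye": {"zebrafish": True, "Xenopus": True, "green anole": True},
    "brain": {"zebrafish": True, "Xenopus": False, "green anole": True},
    "testis": {"zebrafish": True, "Xenopus": True, "green anole": False},
}


def fitch_root_states(tree, leaf_states):
    """Bottom-up Fitch pass: return the set of most parsimonious root states."""
    if isinstance(tree, str):
        return {leaf_states[tree]}
    left, right = (fitch_root_states(child, leaf_states) for child in tree)
    shared = left & right
    # Intersection if the children agree on a state, otherwise the union.
    return shared if shared else left | right


def ancestral_profile(tree, expression):
    """Infer the root state set for every organ independently."""
    return {organ: fitch_root_states(tree, states)
            for organ, states in expression.items()}


if __name__ == "__main__":
    for organ, states in ancestral_profile(TREE, EXPRESSION).items():
        print(organ, sorted(states))
```

Under these assumptions, presence is the single most parsimonious ancestral state for the eye, brain, and testis, in line with the inference stated above.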

The expression signals of Pax4, -6, and -10 in the adult eyes
are detected in specific layers in the retina (fig. 6C-R). The
retinal expression of Pax6 genes in the inner nuclear and the
ganglion cell layers identified in this study is equivalent to that in other
vertebrate species (reviewed in Osumi et al. 2008). However,
we did not detect Anolis and Xenopus Pax6 transcripts in the
horizontal cell layer (fig. 6K-R), an expression domain
previously reported for several vertebrates, including an
elasmobranch shark, zebrafish, chicken, and mouse (Belecky-Adams
et al. 1997; Macdonald and Wilson 1997;
Wullimann and Rink 2001; Ferreiro-Galve et al. 2011). We
report for the first time expression of a Pax4 gene, namely
zebrafish pax4, in the inner nuclear and the ganglion cell layer
in addition to the outer nuclear layer (fig. 6J) as reported in
mammals (Rath et al. 2009). The expression of Pax10 is almost
identical to that of Pax6, except that the Pax6 expression is nested
within the broader area of Pax10-positive cells in Anolis
(fig. 6L-N). The expression of zebrafish pax10a appears to
change in the course of development. Although the pax10a
expression is restricted to the inner nuclear layer at 2 dpf (Ravi
et al. 2013), it expands to the horizontal and ganglion cell
layers in the adult retina (fig. 6C and D). A parsimonious
reconstruction of the nondevelopmental expression profile
indicates a role of Pax10 in the mature retina and brain of the
euteleostome ancestor (fig. 7).

Although the proto-Pax4/6/10 gene in the vertebrate ancestor
had a dual role in the visual system, namely an early one
in eye morphogenesis and a late one in mature photoreceptors,
these roles could have been subdivided during vertebrate
evolution. The vertebrate Pax6 gene retained the crucial role in
eye morphogenesis, whereas Pax4 and -10 as well as Pax6 are
coexpressed in the mature vertebrate retina (fig. 7).

The redundancy resulting from the quadruplication of a
proto-Pax4/6/10 gene in the 2R-WGD presumably led to a
functional partitioning among the four paralogs early in
vertebrate evolution. In order to understand these secondary
changes in associated developmental pathways, the ancestral
expression profile of the proto-Pax4/6/10 gene needs to be
reconstructed and compared with those of individual Pax4, -6,
and -10 genes in extant vertebrates. This reconstruction of the
ancestral state should be reinforced by comparing expression
patterns of their orthologs in nonvertebrate deuterostomes
with those in protostomes. So far, sparse information on
expression patterns in adults for most invertebrates does not
allow a reliable reconstruction of nondevelopmental roles
of the ancestral Pax4/6/10 gene. However, expression of an
eyeless isoform in adult Drosophila eyes, more precisely in
photoreceptors, has been reported (Sheng et al. 1997). The
reconstruction of the ancestral expression profile of Urbilateria
suggests a role of the proto-Pax4/6/10 gene in the developing
visual and olfactory systems, the CNS, and mature
photoreceptors (fig. 7).

Pancreatic expression of Pax4 and -6 was reported in mammals
(Turque et al. 1994; Sosa-Pineda et al. 1997) and teleost
fish (Thisse and Thisse 2004; Manousaki et al. 2011). The
single Pax4/6/10 proto-ortholog in amphioxus lacks expression
in the possible pancreas homolog (Reinecke 1981; Glardon
et al. 1998; Sun et al. 2010). After the split of cephalochordate
and tunicate lineages, the proto-Pax4/6/10 should have
gained the pancreatic expression domain in the vertebrate
ancestor before the 2R-WGD. Thus, expression of Pax4 and
-6 in the pancreas presumably represents a synapomorphy of
vertebrates. On the other hand, the vertebrate Pax10 gene,
the third paralog, presumably did not retain the pancreatic
expression domain (fig. 7).

Fig. 7. — Evolutionary scenario focusing on the functional diversification of the Pax4/6/10 class of genes. Expression domains identified in this study are
mapped with those previously described onto a simplified gene tree of the Pax4/6/10 class. Based on parsimonious reconstruction, a proto-Pax4/6/10 gene at
the last common ancestor of protostomes and deuterostomes, the so-called Urbilateria, was most likely expressed in photoreceptors, olfactory placode,
and the developing eye and CNS. Secondary modification, such as the gain of a pancreatic expression before the 2R-WGD or the loss of several Pax4 and -10
expression domains, led to the functional differentiation among Pax4, -6, and -10.

Another interesting feature of the newly identified Pax10
gene is the lack of a paired domain in the deduced protein
sequence (fig. 1A; Ravi et al. 2013). Although no other Pax
gene has ever been shown to completely lack the eponymous
paired box, there are reports of "paired-less" Pax6 isoforms
through alternative splicing in quail (Carriere et al. 1993),
mouse (Kammandel et al. 1999), and zebrafish (Lakowski
et al. 2007), as well as the ortholog in Caenorhabditis elegans,
vab-3 (Zhang and Emmons 1995). It was shown in mice that
paired-less Pax6 is one of the three Pax6 isoforms whose
transcription is initiated from three alternative promoters
(Kammandel et al. 1999) and is expressed in a tissue-specific
manner (Mishra et al. 2002). The structural similarity between
paired-less Pax6 isoforms and the Pax10 gene is intriguing. It
will be interesting to disentangle the functional interplay at
the cellular level among the paired-less Pax10 gene, isoforms
of Pax6, and also Pax4. Teleosts, in particular zebrafish, would
be an ideal system as they are amenable to genetic experiments
and have retained orthologs of Pax4, -6, and -10.
Chromatin immunoprecipitation combined with sequencing
(ChIP-seq) experiments could potentially reveal differences
and overlaps in the target DNA of these transcription factors.
Knockout (or -down) experiments of pax10 could reveal to
what extent its sister genes might compensate for its loss of
function and thereby reveal potential redundancy. This
experiment might hint at possible consequences that a secondary
gene loss of Pax4 and/or -10 might have had on associated
regulatory pathways.

Supplementary Material 

Supplementary figures S1 and S2 and tables S1-S9 are available
at Genome Biology and Evolution online
(http://www.gbe.oxfordjournals.org/).